Hierarchical Concept Description and Learning for Information Extraction

Authors

  • Luo Xiao
  • Dieter Wissmann
  • Michael Brown
  • Stefan Jablonski
Abstract

This paper addresses the problem of extracting information from textual documents, whether ordinary documents or web pages. It introduces a new approach for extracting complex information from semi-structured documents that exploits a successive hierarchical rule-learning algorithm. Evaluation shows that this approach extracts complex concepts with much higher precision than the equivalent rule learning applied to flat text. In addition, the rate of learning is significantly higher for the hierarchical approach.

Similar resources

Active Learning for Hierarchical Wrapper Induction

Information mediators that allow users to integrate data from several Web sources rely on wrappers that extract the relevant data from the Web documents. Wrappers turn collections of Web pages into database-like tables by applying a set of extraction rules to each individual document. Even though the extraction rules can be written by humans, this is undesirable because the process is tedious, ...

Hierarchical Functional Concepts for Knowledge Transfer among Reinforcement Learning Agents

This article introduces the notions of functional space and concept as a way of knowledge representation and abstraction for Reinforcement Learning agents. These definitions are used as a tool of knowledge transfer among agents. The agents are assumed to be heterogeneous; they have different state spaces but share the same dynamic, reward and action space. In other words, the agents are assumed t...

Ontology Enhancement for Including Newly Acquired Knowledge About Concept Descriptions and Answering Imprecise Queries

This chapter presents a text-mining-based ontology enhancement and query-processing system. The key ideas introduced here are that of learning and including imprecise concept descriptions into ontology structures. This is essential for ontology-based text information extraction since it is not necessary that text description of the concepts or user-specified descriptions will exactly match stor...

Learning Fuzzy Concept Hierarchy and Measurement with Node Labeling

A concept hierarchy is a general form of knowledge representation. Since concept descriptions in human knowledge are generally vague, a crisp description of a concept usually cannot represent human knowledge completely and practically. In this paper, we discuss fuzzy characteristics of concept description and relationship. An agglomerative clustering scheme is proposed to learn hierarchi...

The Cruncher: Automatic Concept Formation Using Minimum Description Length

We present The Cruncher, a simple representation framework and algorithm based on minimum description length for automatically forming an ontology of concepts from attribute-value data sets. Although unsupervised, when The Cruncher is applied to an animal data set, it produces a nearly zoologically accurate categorization. We demonstrate The Cruncher’s utility for finding useful macro-actions i...


Publication date: 2001